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The thermodynamic behavior and structural properties of hydrophobic-polar (HP) lattice pro- 
teins interacting with attractive surfaces are studied by means of Wang-Landau sampling. Three 
benchmark HP sequences (48mer, 67mer, and 103mer) are considered with different types of sur- 
faces, each of which attracting either all monomers, only hydrophobic (H) monomers, or only polar 
(P) monomers, respectively. The diversity of folding behavior in dependence of surface strength 
is discussed. Analyzing the combined patterns of various structural observables, such as e.g., the 
derivatives of the numbers of surface contacts, together with the specific heat, we are able to iden- 
tify generic categories of folding and transition hierarchies. We also infer a connection between 
these transition categories and the relative surface strengths, i.e., the ratio of the surface attractive 
strength to the inter-chain attraction among H monomers. The validity of our proposed classifi- 
cation scheme is reinforced by the analysis of additional benchmark sequences. We thus believe 
that the folding hierarchies and identification scheme are generic for HP proteins interacting with 
attractive surfaces, regardless of chain length, sequence, or surface attraction. 

PACS numbers: 05.10.-a, 82.35.Gh, 87.15.ak, 87.15.Cc 



I. INTRODUCTION 

Protein adsorption on solid surfaces has attracted 
strong research interest recently for its numerous appli- 
cations in nanotechnology and biomaterials [1, 2]. The 
study of protein functions in experiments often involves 
the immobilization of proteins jsl, Adhesion of pro- 
teins on solid substrates including metals, semiconduc- 
tors, carbon or silica etc., enables the synthesis of new 
materials for biosensors or electronic devices It is 

also the key to reveal the principles of many biological 
processes and causes of diseases, e.g., when integrating 
implanted materials with body tissues [ol, |To|- Under- 
standing how the functions and conformations of a pro- 
tein are affected by adsorption and de sorp tion is an im- 
portant topic in protein drug delivery [ll|, |l2| . 

It is known that configurational changes of protein 
molecules upon surface adsorption depend on both the 
protein's properties (e.g. sequence, size, thermodynamic 
stability, etc.) and the surface properties (e.g. mate- 
rials, polarity, surface roughness, etc.); but how large 
these changes are and where in the protein molecules 
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they occur remain puzzles to be solved [l|, [T3|. Enor- 
mous efforts have been dedicated to unveil the mysteries 
in protein folding and adsorption mainly by experimen- 
tal approaches [13, [HI- Nevertheless, the fact that only 
the "final product" can be obtained and studied in an 
experiment makes for slow progress in understanding the 
dynamics and folding processes. From another point of 
view, the diversity of possible protein sequences and so- 
phisticated interactions among amino acids also compli- 
cate theoretical structural prediction in protein folding - 
not to mention when the protein interacts with solvent 
molecules or a substrate where an extra level of complex- 
ity enters. 

With the advances in computer power, numerical sim- 
ulation has become a promising way to shed more light 
on the general problem. The study of simplified, coarse- 
grained protein models in conjunction with Monte Carlo 
simulations, being able to efficiently explore large con- 
formational phase spaces [l6|, U^, has been a particu- 
larly successful approach in understanding the principles 
behind protein folding and adsorption. Among these 
minimalist models, the hydrophobic-polar (HP) model 
introduced by Dill et al. [l8|, [l9| has been a particu- 
larly active, yet difficult, research subject in recent years. 
The interest in the HP model is twofold: (i) Despite its 
known limitations [20|, [2l| , the model captures some of 
the most important qualitative features which drive the 
folding of proteins and characterize their native states. 
It thus laid a basis to systematically study many prob- 
lems in protein folding by means of computer Simula- 
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tions. (ii) The ground state search for an HP sequence is 
an NP-complete problem [22|, [23| , and the rough free en- 
ergy landscapes cause traditional Monte Carlo methods, 
e.g. Metropolis sampling plj, to fail in the low tempera- 
ture regime Therefore, the HP model is a well-suited 
testi ng gro und for a number of emerging numerical meth- 
ods |26l-[34|. In addition, the thermodynamic behavior of 
different HP sequences can vary noticeably, even for the 
same chain length [32|. Consequently, finite-size scal- 
ing cannot be applied to a systematic study of the effect 
of system size, unlike many other models in statistical 
physics. Thus, our understanding of the general behav- 
ior of the model is still incomplete and the simulation 
of the HP model remains a challenging computational 
problem in statistical physics. 

Various approaches have been undertaken to better 
understand the energy landscapes, thermodynamics and 
conformational transitions of HP proteins adsorbing on 
an attracting surface (sBMioj. In this work, our intent is 
to identify generic thermodynamic and structural behav- 
ior of protein adsorption from a macroscopic perspective 
using the HP model. We have adopted Wang-Landau 
sampling (4ll-[43| to obtain the density of states of a sys- 
tem, from which subsequent thermodynamics can be cal- 
culated. To the best of our knowledge, it is the first com- 
prehensive analysis of structural transitions for protein 
adsorption that integrates results from multiple HP se- 
quences. Previous work by Bachmann and Janke 1391 . 
Swetnam and Allen [40|] or Radhakrishna et al. [4^ 
only studied the conformational pseudo-phases based on 
individual benchmark HP sequences. Comparing the 
thermodynamic and structural properties of multiple se- 
quences allowed us to identify categories of generic tran- 
sition behavior and to draw a correspondence between 
these categories and the relative interaction strengths in- 
volved. This finding is fundamental as it implies that 
different HP sequences share certain general, qualita- 
tive, characteristics in structural transformations when 
brought near to an adsorbing substrate. 

The article is organized as follows: Section HIl describes 
the model employed. Section IIIII explains the sampling 
method and details. Section |IV] presents the simulation 
results. Section |V] discusses the analysis and interpreta- 
tion of the results, and Section IVll gives a summary and 
outlook of this work. 



II. MODEL 

The commonly known 22 amino acids found in proteins 
can roughly be classified as either hydrophobic or polar 
depending on the nature of their side chains. The ten- 
dency of the non-polar residues to stay away from water 
molecules has been identified as the key driving "force" 
in forming tertiary structures. The hydrophobic-polar 
(HP) model [H, is a coarse-grained lattice model for 
proteins that captures this hydrophobic effect. In this 
model, an amino acid is treated as a single monomer and 



it can only be hydrophobic (H) or polar (P). A protein is 
thus represented by a heteropolymer which consists of N 
connected monomers of type H or P. An attractive inter- 
action exists only between a pair of non-bonded nearest 
neighboring H monomers. This attraction is denoted by 
£hh in our discussion, and the magnitude indicates the 
ability of the H monomers to pull themselves together as 
determined by the insolubility of the protein in an aque- 
ous environment. In other words, the solvent quality is 
intrinsically considered by the model. Other factors like 
charges and acidity of amino acids that also govern pro- 
tein folding are not handled in this scope. 

In addition to the internal interactions within the poly- 
mer, the binding of a protein with an attractive sub- 
strate contributes to the total energy. We have consid- 
ered three types of surface fields in view of the setting 
of the HP model: (i) a surface interacts only with H 
monomers (a hydrophobic surface) with strength Ssh, 
(ii) a surface interacts only with P monomers (a polar 
surface) with strength ssp, and (iii) a surface interacts 
with both H and P monomers with equal strength, i.e., 
^SH = £sp 7^ 0. The energy function of an HP sequence 
interacting with a surface takes the general form: 

E = -SHHriHH - SsHTiSH - SSP^SP, (1) 

where uhh denotes the number of interacting pairs be- 
tween H monomers, ush the number of surface contacts 
with H monomers and nsp the number of surface con- 
tacts with the P monomers. Besides contributing to the 
system's energy, these three quantities are also useful "or- 
der" parameters that give quantitative measures of the 
structure of a conformation [45[. The negative signs in 
front of each term indicate that it is energetically favor- 
able when the monomers interact or come in contact with 
the surface. 

Our simulations were performed on a three- 
dimensional simple cubic lattice with the attractive 
surface being the x?/-plane placed at z = 0. A non- 
interacting steric wall is placed at z = TV + 1 to confine 
the polymer in a way that it can contact both walls 
with its ends only when it is a vertical straight chain. 
Periodic boundary conditions are imposed in the x and 
y directions. 

III. METHOD 

A. Calculation of thermodynamic quantities 

The partition function, Z, at a particular temperature 
T can be expressed in terms of the energy density of 
states g{E): 

Z = Y,9{E)e-^/''-^, (2) 

E 

where E is the energy of the system as defined by the 
energy function, /c^ is the Boltzmann constant, and the 
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sum runs over all the energies that the system can take. 
Since g{E) does not depend on T, one may calculate 
Z at any temperature with a single estimate of g{E). 
All the thermodynamic quantities then follow from the 
knowledge of Z. For example, the average energy {E) 
and the heat capacity Cv are calculated as: 



Cv 



(3) 



(4) 



The specific heat is then defined as Cy jN accordingly. 



B. Wang-Landau sampling and trial moves 

Wang-Landau (WL) sampling is a powerful, iterative 
algorithm to estimate the density of states, g(E\ with 
high accuracy. Details of the algorithm are described in 
Refs.(4ll-[43j. In our simulations, rather stringent param- 
eters were chosen in order to obtain accurate estimates 
for g{E)\ We used a fiatness criterion p = 0.8 for the 
48mer and p = 0.6 for the 67mer and 103mer. The final 
modification factor was set to ln{f final) = 10~^ in all 
cases. 

It has been found that two types of trial moves, pull 
moves [46| and bond-rebridging moves [i^ , work partic- 
ularly well together with WL sampling in search of the 
global energy minimum conformations and the determi- 
nation of the density of states for lattice polymers (isl - 
[5Q| . The ability of reaching low energy states allows for a 
more thorough survey of conformational space, yielding 
a higher resolution of g{E) and thus more precise ther- 
modynamic quantities especially in the low temperature 
regime. This is of particular importance for longer chain 
lengths with more complex energy landscapes [50]. 

The two trial moves are called with different probabil- 
ities. Bond-rebridging moves transform a polymer from 
one compact state to another without uncoiling, making 
it more efficient than pull moves in dealing with densely 
packed polymers. However, since the energy difference 
resulting from a bond-rebridging move is relatively large, 
its acceptance rate is rather low. This drawback is com- 
pensated for with a higher calling ratio. In our simula- 
tions, we used move fractions of 80% and 20% for bond- 
rebridging moves and pull moves, respectively. 

In order to fulfill detailed balance when employing pull 
moves, an extra factor is added to the acceptance proba- 
bility of moving from state A (with energy Ea) to state 
B (with energy Eb)- 
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V g{EB) riA^B riA J 
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which can be performed from state A; ub^a and ub are 
defined likewise. Because of reversibility of pull moves, 
riA^B = TiB^A and the two terms cancel out in Eq. 
(|5j). When a pull move is chosen to generate a new con- 
figuration, a list of possible moves from state A is first 
constructed to obtain ua- A move is then selected ran- 
domly from the hst, generating state B. ub is counted 
and P{A B) can then be calculated. This procedure 
is computationally expensive (it slows down the simula- 
tion by approximately an order of magnitude); however, 
it secures the reliability of our results by taking detailed 
balance into careful consideration. 

We adopted RANLUX as the random number gener- 
ator (gsl_rng_ranlxd2), which is recommended by the 
GNU Scientific Library (GSL) for its "reliable source of 
uncorrelated numbers" and "strongest proof of random- 
ness" [5l|. It uses a lagged-Fibonacci-with-skipping algo- 
rithm [525 with double precision output and a long period 
of about 10^^^ 

The use of Wang-Landau sampling together with in- 
ventive trial moves is essential for uncovering the subtle, 
low temperature thermodynamics of these systems. This 
is demonstrated in Fig. [1] which shows the specific heat 
of a 36mer (P3H2P2H2P5H7P2H2P4H2P2HP2) interact- 
ing with a very weak attractive surface {ssh = £sp = 
^^£hh = 12). As seen in the figure, all the surface related 
transitions below kBT/enH ^ 0.3 are completely inac- 
cessible by Metropolis sampling, even with very long runs 
(10^ trial moves!) incorporating pull moves and bond re- 
bridging moves! In contrast, Wang-Landau sampling is 
even able to uncover the shoulder at kBT/sHH ^ 0.016 
shown in the inset (caused by a rapid change in configu- 
rational entropy of the excitation from the ground state 




where ua^b is the number of pull moves that trans- 
form A to B; TiA IS the number of possible pull moves 



kjl &^ 



FIG. 1. (Color online) Main graph: Specific heat of a 36mer 
interacting with a very weak attractive surface as obtained by 
Wang-Landau and Metropolis sampling. Where not shown 
statistical errors are smaller than the data points. Compa- 
rable computational effort was used for each method. Inset: 
Results at extremely low temperature. Typical configurations 
above and below the shoulder are shown. 
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to the first few excited states; see Ref. [ssj for details). and 

We, therefore, stress that sophisticated Monte Carlo al- ^ 

gorithms are crucial in simulating systems with delicate (Q) = ^^^Qg{E,Q)e~^^^^^ . (10) 

low temperature behavior. ^ q 

Recently, improvements have been proposed to speed 
up and ensure the convergence of WL sampli ng in simu- 
lating lattice polymers or proteins ^ H, [50|, [Hi . IV. SIMULATION RESULTS 



C. Calculation of thermodynamics of structural 
observables 



Structural quantities are essential in understanding 
non-energetic properties of the system. They provide in- 
formation on the structures and packing of the polymer. 
Apart from the quantities uhh^ ^sh and nsp introduced 
in Section [TTl structural quantities that are often of in- 
terest include the radius of gyration. 



Rn 



\ 



1 ^ 

i=l 



12 



and the end-to-end distance. 



Re 



\rN - ri| , 



(6) 



(7) 



where fcm in Eq. ([6]) is the center of mass of the config- 
uration; fi is the position of monomer i. 

To obtain the thermodynamics of a structural observ- 
able Q, we estimated the joint density of states, g{E^ Q), 
by multicanonical sampling [Hll, [56|. Although it is 
feasible to sample g{E^Q) all over again using a two- 
dimensional random walk in WL sampling if only one 
structural quantity is required, it becomes impossible if 
several of them are of interest. A more efficient way is to 
make use of the prior knowledge of g{E) and perform a 
multicanonical sampling. In this process, trial states are 
accepted or rejected according to l/g{E), where g{E) is 
the one obtained previously by the one-dimensional WL 
sampling and is held fixed throughout the whole produc- 
tion procedure. As a new trial state is accepted, any 
desired structural quantity Q would be calculated and 
accumulated in a two-dimensional histogram, H{E^Q). 
The simulation is brought to an end when a sufficiently 
large number of Monte Carlo steps have been performed. 
The joint density of states, g{E^ Q), is then obtained by 
reweighing H{E^ Q): 



g{E,Q)=g{E)H{E,Q). 



(8) 



As such, we can obtain as many g{E^ (5)'s for various Q's 
as desired in a single production run. 

The partition function, Zg, for observable Q and its 
expectation value can then be obtained as 



E,Q 



(9) 



We have studied three benchmark HP sequences 
(48mer, 67mer, and 103mer) interacting with three types 
of surfaces (see above). The 48mer (PHPHP4HPHPHP2- 
HPH6P2H3PHP2HPH2P2HPH3P4H) is seq. 9 among 
the ten "Harvard sequences" designed originally for 
algorithm testing purpose [s^, where the number 
of H and P monomers are exactly the same in each 
sequence, i.e., 24 H monomers and 24 P monomers 
respectively. The one chosen here for our simulation 
has the minimum ground state degeneracy in 3D 
free space [32|, The 67mer (PHPH2PH2PHPP- 

H3P3HPH2PH2PHP2H3P3HPH2PH2PHP2H3P3HP- 
H2PH2PHP2H3P) was first introduced to resemble 
a//3-barrel in real proteins j58|, while the lOSmer 
(P2H2P5H2P2H2PHP2HP7HP3H2PH2P6HP2HPHP2H- 
P5H3P4H2PH2P5H2P4H4PHP8H5P2HP2) was proposed 
to model cytochrome c [59| . 



A. Ground states and limiting behavior 

Table [J reports the lowest energies found for these three 
sequences during the estimation of g{E) with different 
types of attractive surfaces. To understand the "asymp- 
totic" folding behavior in the limit of a surface with infi- 
nite attractive strength, we simulated the case where the 
HP chain was confined to a two-dimensional free space. 
This is equivalent to restricting all monomers of the HP 
chain on the surface, giving a 2D ground state. Another 
limiting case is when the surface is totally absent in a 
three-dimensional space. For more details on these two 
cases, see [6Q| . 

These two limiting cases are useful in visualizing up- 
per and lower bounds for thermodynamic observables and 
they serve as an aid to understand the details of folding 
behavior. A demonstrative example is the averaged ra- 
dius of gyration per monomer, (Rg) /N ^ shown in Fig. 
[2] for the 48mer interacting with a surface attracting all 
monomers (surfaces Al, A2 and A'^/2). The radii of gy- 
ration of the two limiting cases are plotted on the same 
figure. Drawing a simple connection to the self-avoiding 
random walk on square and cubic lattices, it is obvious 
that {Rg) is largest when all the monomers are forced to 
sit on the surface to form planar structures, yielding the 
upper bound for {Rg). Correspondingly, {Rg) is smallest 
when the HP chain is allowed to fold freely in a three- 
dimensional space to form 3D structures, giving the lower 
bound of {Rg). Generally speaking, when the HP chain 
is placed near a surface of finite attractive strength, it 
remains as an extended coil at high temperature as if the 
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TABLE I. Lowest energies found for the 48mer, 67mer, and 
103mer interacting with different attractive surface types and 
strengths, which are abbreviated in the surface labels (A, H or 
P stand for the surface types, the numbers stand for the ratio 
between esH or ssp and shh). Classification of transition 
categories are denoted by the Roman numbers in parentheses. 



surface shh ssh ssp 

Free space without surf 
2D 1 / / 
3D 1 / / 


48mer 
ace I 

-21 
-34 


67mer 

-29 
-56 


103mer 

-32 
-58 


Surfaces attract all mor 
Al 1 1 1 

1 2 2 

2 1 1 


lomers: 

-69 (I) 
-117 (I) 
-93 (II) 


-96 (I) 
-163 (I) 
-132 (II) 


-135 (I) 

/ 

-167 (I/II) 


Surfaces attract only H 
ifl 1 1 
if2 1 2 

m/2 2 1 


monomers: 
-49 (II) 
-73 (I) 
-79 (III) 


-72 (II) 
-108 (I) 
-118 (II) 


-80 (II) 

/ 

-128 (III) 


Surfaces attract only P 
PI 1 1 
P2 1 2 
Pi/s 2 1 


monomers: 
-48 (II) 
-71 (I) 
-79 (III) 


-69 (II) 
-91 (I) 
-123 (II) 


-100 (I/II) 

/ 

-150 (II) 




I I I I I I I I I I 1_ 

0.0 1.0 2.0 3.0 4.0 5.0 



surface is absent. The radii of gyration for all cases thus 
coincide with the 3D, surface- free one. 

As the temperature decreases, the HP chain interact- 
ing with a stronger attractive surface (^42) starts the ad- 
sorption process the earliest at kBT/sHH ^ 5.0 as its 
(Rg) "departs" from the lower bound and begins to ap- 
proach the upper bound. Such an adsorption "transi- 
tion" is clearly signaled by the peak in Cy centered at 
kBT/sHH ~ 4.25. At kBT/sHH ~ 1-0, (Rg) merges 
with the upper bound signifying a complete adsorption 
of all monomers. The formation of a hydrophobic core in 
which the number of inter-chain H-H interactions, uhh^ 
is maximized, then takes place entirely on the surface in 
this case until the ground state is reached at zero tem- 
perature. This process in the low temperature regime 
is identical to the one in two-dimensional free space, as 
indicated by the complete agreement in the radii of gyra- 
tion and the coincidence of the peak at kBT/sHH ^ 0.5 
observed in Cy- 

The thermodynamics for surface Al is qualitatively 
similar to that of surface A2 except that it requires a 
lower adsorption temperature. Since the radii of gyra- 
tion for both surface types end up with the same value 
as the upper bound at T = 0, one may expect that the 
ground state conformations for both systems are two- 
dimensional. This has been confirmed by the number of 
surface contacts {tish = ^sp = 24, meaning the entire 
chain is in contact with the surface) and the number of 
H-H interactions {uhh = 21, which is the same as the 
ground state of the 2D limiting case). 



FIG. 2. (Color online) Upper panel: Specific heat Cv/N 
as a function of the effective temperature ksT /shh- Lower 
panel: Averaged radii of gyration per monomer, {Rg) /N ^ 
as a function of ksT /ehh^ for the 48mer interacting with a 
surface which attracts all monomers with different strengths 
(surfaces Al, A2 and A'^/2). Note that ksT is scaled with the 
internal attraction strength, £hh^ so as to compare different 
systems in the same energy scale. In this manner, any differ- 
ence in quantities comes solely from the surface strengths ssh 
and Ssp- Errors smaller than the data points are not shown. 



While the two peaks in the specific heat of surface ^41/2 
tend to give the impression that it has the same quali- 
tative folding behavior as the previous cases, the shape 
of the radius of gyration clearly distinguishes it from the 
others, apart from showing that the ground state is now 
three-dimensional. This is the first clue that the specific 
heat alone does not provide a complete picture of struc- 
tural transition behavior. Indeed, the transition hierar- 
chy of the 48mer interacting with surface ^41/2 is different 
from surfaces Al and A2^ which can only be verified by 
examining other structural parameters as we shall see in 
the following section. 
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B. Identification of transition categories by 
canonical analysis of specific heat and structural 
quantities 

Three major transition categories were identified from 
the 24 systems presented in Table |I] by considering the 
combined patterns of the specific heat Cy/N and the 
average radius of gyration {Rg) /N. 

Category I: Cy shows two peaks, a bump between the 
peaks might be possible, (Rg) shows a maximum between 
these two peaks. 

Category II: Cy shows two peaks, (Rg) decreases 
upon cooling. In the very low temperature regime, it 
might rise back up a little to form a minimum when the 
temperature approaches zero. 

Category III: Cy shows only one peak with possible 
shoulders, {Rg) decreases upon cooling. 

The different combinations of Cy and {Rg) in the tran- 
sition categories are caused by the different order of oc- 
currence in folding processes, which are revealed by fur- 
ther investigation of proper structural parameters and 
their derivatives. Quantities which are particularly in- 
formative for our systems include the derivatives of the 
average number of H-H interactions, d{nHH) /dT, and 
those of the numbers of surface contacts, d {ush) /dT and 
d {nsp) /dT. A peak in d {uhh) /dT signals the construc- 
tion of H-H interactions to form a hydrophobic core (H- 
core formation). Peaks in d{nsH) /dT and d{nsp) /dT 
provide information about the formation of surface con- 
tacts, which is associated with the adsorption process as 
well as the "fiattening" of structure due to surface at- 
traction. 

We observe that {Ree) behaves quite similarly as {Rg) 
but is less reliable at low temperature where mainly com- 
pact structures are found. For this reason our analysis 
relies on {Rg) rather than on {Ree)- 



1. Folding behavior with a strong attractive surface: 
Category I 

Figure [3] shows a typical transition pattern of cate- 
gory I demonstrated by the 67mer with surface ^42, for 
which €hh = ^^£SH = £sp = 2. It is characterized 
by two pronounced peaks in Cy, with {Rg) attaining 
its maximum between them as seen in the upper panel 
of the figure. The nature of transitions to which the 
two peaks in Cy correspond can be identified by com- 
paring them with d{nHH) /dT and d{nsH) /dT in the 
lower panel. Since the surface attracts both types of 
monomers equally, d{nsp) /dT shows similar behavior 
as d {ush) /dT and thus is not shown in the figure. The 
Cy peak at kpT/eHH ^ 4.4 represents the desorption- 
adsorption transition where d{nsH) /dT peaks at the 
same temperature. The Cy peak at kpT/eHH ^ 0.4 rep- 
resents the H-core formation as d{nHH) /dT also shows 
a peak at that position. {Rg) decreases most rapidly 
during this latter process when temperature is lowered. 
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FIG. 3. (Color online) Thermodynamics of the 67mer in- 
teracting with surface A2 (shh — ^^ssh — ssp — 2), which 
shows a typical category I transition. Upper panel: Specific 
heat, Cv/N, and the average radius of gyration per monomer, 
{Rg) /N, as a function of the effective temperature ksT /shh- 
The horizontal arrows beside the labels indicate the axes to 
which the quantities refer. Middle panel: Typical config- 
urations at different temperatures. Lower panel: Deriva- 
tives of the average numbers of H-H contacts per monomer, 
(1/N)d {uhh) /dT, and that of the average number of surface 
contacts of H monomers per monomer, (1/N)d {ush) /dT, as 
a function of ksT /shh- Errors smaller than the data points 
are not shown. 



A closer look at Cy in Fig. [3] shows a weak bump be- 
tween kpT /shh ^1.0 and 3.5. The same phenomenon is 
also observed in d {ush) /dT (and d {nsp) /dT), suggest- 
ing that a subtle process related to the interaction with 
the surface is taking place in this temperature range. Re- 
calling our previous discussion on the radius of gyration 
{Rg) in Section |IVA| this is a region where the HP chain 
keeps forming contacts with the surface until it adsorbs 
completely on the surface. We call this process a "fiat- 
tening" of the structure. When the surface attraction is 
sufficiently strong, it occurs right after the chain is ad- 
sorbed to the surface but before the H-core formation. 
The top graph in the leftmost column in Table ITTl shows 



a similar case for the 48mer, shh = ^^^sh = ^sp = 2. 

However, there are cases where this "flattening" bump 
is not observed in Cy, as seen from the other two exam- 
ples for the 67mer and 103mer with different surface at- 
tractions in Table [Til The flattening process might have 
been "integrated" within adsorption, or it simply does 
not cause enough energy fluctuations to give a visible sig- 
nal in Cv- In the latter case, signals can be found in other 
structural quantities like d{nsH) /dT or d{nsp) /dT. 



2. Folding behavior with a moderately attractive surface: 
Category II 

Figure 2] shows the thermodynamics for the 103mer 
with surface PV2, a typical case in category II. Simi- 
lar to category I, systems in category II also show two 
pronounced peaks in Cy and identification of structural 
transitions depends on the derivatives of (uhh)^ {^sh) 
and (nsp). The peak at kBT/sHH ^ 0.85 represents 
the desorption-adsorption transition as identified by the 
peaks in d{nsH) I dT and d{nsp) /dT. Another peak at 
kBT/sHH ^ 0.42 indicates the H-core formation as sig- 
naled by a peak in d (uhh) /dT. 

The feature that differentiates category II from cate- 
gory I is the absence of a maximum for {Rg) between the 
two peaks in Cy • It decreases upon cooling until the very 
low temperature regime. The difference arises from the 
fact that the flattening of structures occurs at a lower 
temperature than the H-core formation in the vicinity of 
a less attractive surface, giving rise to another transition 
hierarchy than category I. Two possibilities for {Rg) are 
then observed when the temperature is further lowered: 
(a) it keeps descending as in Fig. H) (b) it rises back up 
until T = 0, forming a minimum below the H-core for- 
mation temperature as the 48mer does in Table [III (top 
graph, middle column for category II). 

Interesting observations at low temperature are 
revealed by the thermodynamics of d{nHH)/dT^ 
d (tish) /dT and d (nsp) /dT as shown in the lower panel 
of Fig. m During H-core formation at kBT/sHH ^ 0.42 
where d{nHH) /dT peaks at, troughs are observed in 
d {tish) /dT and d (nsp) /dT. This is a process of "thick- 
ening" during which some of the surface attachments 
have to be broken to facilitate the construction of H-H 
interactions. 

When the temperature is further lowered to 
ksT/sHH ^ 0.25, a subtle shoulder could barely be 
seen in Cy and {Rg) stays still on cooling; d{nsp) /dT^ 
however, shows a clear peak. This suggests that surface 
contacts for the P monomers are established, demon- 
strating the flattening effect. Eventually the structures 
with minimal possible energy are attained but they 
no longer span as many layers vertically as at higher 
temperature. These structures are not completely planar 
as in category I, as forming surface contacts is not always 
more energetically favorable than forming hydrophobic 
H-H interactions. 
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FIG. 4. (Color online) Thermodynamics of the 103mer in- 
teracting with surface P^/2 {shh — '2,esH = 0,ssp — 1), 
which shows a typical category 11 transition. Upper panel: 
Specific heat, Cv/N, and the average radius of gyration per 
monomer, (Rg) /N, as a function of the effective tempera- 
ture ksT /ehh- The horizontal arrows beside the labels indi- 
cate the axes to which the quantities refer. Middle panel: 
Typical configurations at different temperatures. Lower 
panel: Derivatives of the average numbers of H-H contacts 
per monomer, {1/N)d (uhh) /dT, and those of the numbers 
of surface contacts, (1/N)d (ush) /dT and (1/N)d (nsp) /dT, 
as a function of ksT/sHH, respectively. Errors smaller than 
the data points are not shown. 



In many other examples of this transition category (see 
Table Cy only has two major transition peaks, some- 
times with a subtle shoulder or a spike merged into the 
peaks as a result of a combination of various events. In- 
dividual investigation of structural measures is thus es- 
sential to segregate different structural changes. 
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FIG. 5. (Color online) Thermodynamics of the 48mer inter- 
acting with surface {shh = ^^esH = 0,ssp = 1), which 
shows a typical category III transition. See Fig. [4] for figure 
explanations. 



3. Folding behavior with a weak attractive surface: 
Category III and beyond 



When the surface attractive strength further reduces, 
the adsorption and flattening temperatures decrease ac- 
cordingly. Category III is identifled when the adsorption 
transition coincides with H-core formation, giving a sin- 
gle peak in Cy associated with a shoulder in some cases 
like the example shown in Fig. [5l The thermodynam- 
ics of category III transitions looks similar to that in 3D 
free space. In both cases, (Rg) decreases upon cooling 
and Cv peaks at nearly the same temperature, except 
that a higher peak results for systems of category III. 
Since adsorption and H-core formation now occur almost 
together at nearby temperatures, more conformational 
degrees of freedom are introduced by the surface inter- 
actions and this higher entropy gain results in a larger 
Cv 

Details of transitions are again provided by 
d{nHH)/dT^ d{nsH)/dT and d{nsp)/dT. From 
the positions of peaks in d{nsH) I dT and d{nsp) /dT^ 



one may identify adsorption at kBT/sHH ^ 0.63 and 
flattening at kBT/sHH ^ 0.24 respectively. d{nHH) /dT 
manifests a wide peak across the adsorption and flatten- 
ing temperatures, which suggests that the hydrophobic 
core is formed roughly in the temperature range 
kpT /shh ^ 0.25 — 0.87. Instead of producing individual 
peaks in Cy, the signals of adsorption and flattening are 
"bridged" and smoothed out by the H-core formation, 
giving only a peak with a shoulder in Cy, the shape 
and location of the latter being often subject to some 
variability. Therefore, it is necessary to rely again on 
structural quantities to separate signals for the various 
transitions as illustrated in the previous categories. 

For very weak attractive surfaces, adsorption and flat- 
tening occur at even lower temperatures. They become 
distinguishable from the H-core formation which takes 
place at a higher temperature, forming two or even three 
distinct peaks in Cy- We generally classify systems with 
this transition hierarchy as category IV. For a detailed 
discussion of an example from this category, see [53|, |6l| . 



V. DISCUSSION 

A. Remarks on the structural measures and 
categories 

Our results demonstrate that a comprehensive analysis 
of the speciflc heat (Cy) combined with a set of appro- 
priate structural quantities is essential to shed light on 
recognizing structural transformations, especially those 
subtle ones for which Cy alone provides insufficient in- 
formation. We also note that in identifying phase tran- 
sitions, the peaks observed in structural quantities and 
those in Cy might be slightly off. One possible expla- 
nation could be finite size effects [62|. Nevertheless, this 
does not affect our identification scheme much as these 
shifts are sufficiently small compared to the difference in 
temperature scales required to clearly identify distinct 
categories of transition patterns. 

Table [III further summarizes and compares the ther- 
modynamic features for the three categories discussed in 
the previous section. We have intentionally chosen exam- 
ples of various combinations of surface types (i.e. surface 
strengths) and chain lengths for each category to illus- 
trate that the classification scheme is generic regardless 
of these variations. 



B. Classification of categories using relative surface 
attraction strengths 

Our classification scheme effectively generalized the 
folding behavior into certain transition hierarchies. We 
further observe that the dominating factor determining 
the transition category is closely associated with the rel- 
ative surface attraction, specifically, the ratio between 
^SH + ssp and shh- 
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TABLE II. Characteristic thermodynamics for Categories I, II, and III: Specific heat, Cv/N, and average radius of gyration per 
monomer, (Rg) /N, as a function of the effective temperature ksT /shh- Errors smaller than the data points are not shown. 



Category I 



Category II 



Category III 



48mer, shh = 1, ssh = 2, ssp = 2 
0.8| ' ' \0.12 




48mer, shh = 2, ssh 1, ssp = 1 
1.2| ' ' I \0m 



36mer, ehh = 3, esH = 1, £sp = 1 




0.06 



/A^V 0.04 



0.02 



1.2 



0.4 



-0.08 



0.06 



67mer, shh = 1, £sh = 2, e^p = 
0.8| ' ' I ^ ^0.12 

Cv/N/ 



67mer, shh = 1,£sh = 0, ssp = 1 



0.5 



1.0 



0.4 




0.08 



1.2 



0.4 



lOSmer, = = 1,£sp = 1 

0.81 ' ' I ^0.12 



0.08 



0.06 



0.04 



0.02 




103mer, shh = 1,£sh = l,£sp = 




1-0 , ^, 2.0 



0.08 



0.04 



1.2 



0.4 







■ I' 






Cv/N\ 







).0 0.4 , ^,0.8 



0.08 



0.06 



0.04 



0.02 



48mer, shh = 2, ssh = 1, £sp = 
I.2F Cv/N^p^ JO.08 

0.06 

0.04 



lOSmer, shh = 2, ssh = 1, £sp = 
1.2- 

0.8 

0.4 

O.Q 







0.08 


Cv/Nf 


\ <R >/N 


0.06 






0.04 





0.5 1.0 



shows two peaks, one for adsorp- 
tion at high T, one for H-core for- 
mation at low T 

might show a bump for flattening 
between the two peaks 



shows two peaks, one for adsorp- 
tion at high T, one for H-core for- 
mation at low T 

might have a shoulder or a spike 
for flattening below the H-core for- 
mation T 



shows only one peak for a com- 
bination of adsorption, flattening 
and H-core formation 

peak might have one or more 
shoulders for processes taking 
place at nearby T 



shows a global maximum between 
the two Cv peaks, signifying the 
end of flattening and the start of 
H-core formation upon cooling 



decreases upon cooling, might 
form a local maximum between 
the two Cv peaks 

might rise back up due to flatten- 
ing after H-core formation, form- 
ing a minimum at low T 



keeps decreasing upon cooling 

might rise back up due to flatten- 
ing after H-core formation, form- 
ing a minimum at low T 
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TABLE III. Distribution of transition categories with respect 
to the relative surface attractions. The abbreviations refer to 
the surface types introduced in Table [H The numbers are the 
short forms of the benchmark sequences (e.g. 36 stands for 
the 36mer, 48.1 and 48.9 correspond to Seq. 1 and 9 among 
the ten 48mers, respectively). 



+ 

£SP 


Category I 


Category II 


Category III 


> shh 


Al: 36, 48.1, 
48.9, 64, 
67, 103 

A2: 36, 48.9, 
64, 67 

H2\ 48.1, 48.9, 
64, 67 

P2: 48.1, 48.9, 
64, 67 






— Shh 


^1/2: 
PI: 


103 
103 

^1/2: 36, 48.1, 
48.9, 64, 
67 

HI: 48.1, 48.9, 
64, 67, 
103 

PI: 48.1, 48.9, 
64, 67 




< Shh 




m/2'. 67 

PV2: 67, 103 


m/2: 48.1, 48.9 
64, 103 
PV2: 48.1, 48.9 
Ai/s^: 36, 64 



Table Uni shows the distribution of transition categories 
against the relative surface attractions for systems with 
various chain lengths and surface types. Ideally, a perfect 
correspondence between the transition categories and the 
relative surface attractions is implied if only the diagonal 
compartments were filled in Table lllli In reality, as ther- 
modynamic subtleties vary from sequence to sequence, 
some off-diagonal compartments are also occupied (e.g., 
some systems with esh -\- £sp < ^hh show category II 
behavior). A few systems also reveal "category duality" 
where the thermodynamics bears properties from both 
categories (e.g. the lOSmer interacting with surfaces ^41/2 
or PI). Nonetheless, the generality of our classification 
scheme is clearly apparent and allows us to infer the fol- 
lowing basic rules: transition category I occurs for sur- 
faces which are strongly attractive {ssh ^ssp >^ shh)] 
category II occurs when the hydrophobic internal attrac- 
tion is approximately comparable to the surface strengths 
{esH £sp ^ ^hh)] category III can only occur when 
surface strengths are relatively weak compared to ehh 
{^SH -\- £sp < £hh)' Besides, we have also investigated 
the influence of the proportion of monomers by consider- 
ing the ratio between {nsH /N)esH + {nsp/N)esp and 
^HH- However, we did not find an obvious relation be- 



tween these quantities and the transition categories. We 
thus believe that esH + ^S'P is more suitable for the clas- 
sification scheme. 

We stress that unlike other existing work, this clas- 
sification is an inference based on multiple HP se- 
quences of various chain lengths and attributes. In 
Table HIH we have also included results from three 
other sequences, which were used as a "testing set" 
for the adequacy of the classification scheme: a 36mer 
(P3H2P2H2P5H7P2H2P4H2P2HP2); another 48mer (HP- 
H2P2H4PH3P2H2P2HPH3PHPH2P2H2P3HP8H2); and 
a 64mer (H12PHPHP2H2P2H2P2HP2H2P2H2P2HP2H2- 
P2H2P2HPHPH12). The 36mer and the 64mer were 
originally proposed to test a 2D genetic algorithm [26|, 
whereas the 48mer is Seq. 1 of the ten testing sequences 
in • All results from these additional sequences fall 
into the diagonal compartments in Table IIIIl reinforcing 
that our classification scheme is applicable to other se- 
quences interacting with an adsorbing substrate without 
loss of generality. This is thus a breakthrough in our un- 
derstanding of adsorption properties of lattice proteins: 
Instead of sequence-dependent individual behavior, the 
thermodynamics of HP proteins do follow common pat- 
terns in structural transitions when they interact with an 
adsorbing substrate. 

Even if esH and esp have the same magnitude, they 
are generally not expected to contribute equivalently to 
the total energy because of the competition between ssh 
and Shh for the H monomers. However, the contrary ap- 
pears to be the case in our study because of the entropic 
effects, in which the adsorbing P monomers also hinder 
the formation of H-H contacts in an indirect manner by 
dragging the H monomers to the surface. Consider the 
surface P2 case for instance: As it is energetically more 
favorable to form surface-P contacts than internal H-H 
contacts, the chain tends to sit on the surface rather than 
to form a compact globule. The H monomers are then 
restricted to sit also on the surface instead of forming 
H-H contacts, causing both d{nsH)/dT and d{nsp)/dT 
to peak at the adsorption temperature. 



VI. CONCLUSION 

In this work, protein adsorption has been studied with 
a coarse-grained lattice model, the HP model, interact- 
ing with a surface which either attracts all monomers, 
only hydrophobic monomers or only polar monomers. 
We have employed Wang-Landau sampling with two ef- 
fective Monte Carlo trial moves, pull moves and bond- 
rebridging moves, to obtain the energy density of states 
and subsequent thermodynamics of structural quantities 
for a 48mer, 67mer and 103mer. Ground state energies 
are also reported. Based on the folding and adsorption 
behavior revealed by a careful, comprehensive analysis 
of the specific heat, radius of gyration and derivatives of 
the numbers of surface contacts, we have been able to 
identify four main types of transition hierarchies (three 
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of which are discussed in detail in this work). We have 
found that the occurrence of these transition hierarchies 
is mainly determined by the attractive couplings of the 
surface relative to the internal hydrophobic attraction, 
i.e., the ratio between ssh -\- ssp and shh-, regardless of 
the surface type, chain length or composition of H and P 
monomers of an HP sequence. Three other benchmark 
sequences, a 36mer, another 48mer and a 64mer, have 
confirmed the validity of our classification scheme. 

Although other transition categories cannot be ex- 
cluded from our study, the ones presented here provide a 
general and representative picture of the thermodynam- 
ics of HP proteins interacting with an adsorbing sub- 
strate. Our study also demonstrates that classifying tran- 
sition hierarchies by a combined analysis of the specific 
heat and appropriate structural parameters provides a 
powerful route in approaching similar systems of large 
conformational and sequence spaces, for instance, HP 
proteins interacting with two confining, attractive sur- 
faces [63|,[6|. 

However, further investigation is necessary to deter- 
mine if there is a rigorous relation between the proposed 
transition categories and the relative surface attractions. 
More statistics from longer chains, or chains of the 
same length but with different H and P compositions 



would help clarifying the problem. The next question is 
whether the same conclusions could be drawn, or what 
discrepancies would be found, for other lattice models 
with other energy functions, i.e., different interactions 
between monomers and with the surface. Another 
important question is whether similar classification 
schemes could hold for off-lattice models. In this case, 
thermodynamics of other structural parameters, e.g., 
the gyration tensor, density profile, or any suitable ones, 
could help in identifying such generic transition pat- 
terns. All these together are essential in determining the 
effectiveness of using different simplified protein models 
in computer simulations to study protein adsorption 
from a macroscopic perspective. 
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